Incorporating New Evidence and Uncertainty

Backwards Conversion, Solving for PSA Parameters and Copula-Based PSA Sampling

Learning Objectives

  1. Back-convert an existing transition probability matrix to incorporate new health states, strategies, and evidence.
  2. Solve for probabilisitic sensitivity analysis (PSA) distribution parameters.
  3. Sample correlated PSA distributions using copulas.

1. Backwards Conversion

  • Lectures 2 and 3 emphasized the importance of the transition rate matrix as the central “hub” of a Markov model.

Goal

What Rarely Happens

Adapting a Model Requires a Transition Rate Matrix

1. Backwards Conversion

1. Backwards Conversion

  • If starting a model from scratch, can simply define or estimate rates, and use them to construct the rate matrix.

1. Backwards Conversion

  • What if we have a model that is already defined in terms of a transition probability matrix?
  • How can we convert back to a rate matrix to add new health states, transition states, accumulators, etc. as needed?
  • Also useful if we want to keep everything the same, but change the model time step (e.g., see Chhatwal, Jayasuriya, and Elbasha (2016))

1. Backwards Conversion

  • We will next show you several ways to work backwards.
  • Boils down to solving for the continuous “generator matrix” for the observed transition probability matrix.

A Word of Warning

  • As you’ll see, it’s not always going to work perfectly.
  • If the original transition probability matrix was defined incorrectly (e.g., no jumpover probabilities), the generator matrix may not exist.
  • Identifiability is a more general issue, however.

Working Example: HIV Model

Transition probability matrix:

        Healthy LowCD4  AIDS Death
Healthy   0.721  0.202 0.067 0.010
LowCD4    0.000  0.581 0.407 0.012
AIDS      0.000  0.000 0.750 0.250
Death     0.000  0.000 0.000 1.000
  • From ‘Decision Modelling for Health Economic Evaluation’ by Briggs, Claxton, Sculpher.

Three Methods to Solve for the Generator

  • A. Compute the matrix logarithm of the transition probability matrix \(P\).

  • B. Maximum likelihood-based approach.

  • C. Convert the supplied transition probabilities to rates one-by-one.

  • (You can always try 2+ of these methods to see how they compare.)

A. Matrix Logarithm of P

  • In mathematical terms, the generator matrix is the matrix logarithm of the transition probability matrix.
  • A matrix has a logarithm if and only if if it is invertible.

A. Matrix Logarithm of P

  • In mathematical terms, the generator matrix is the matrix logarithm of the transition probability matrix.
  • A matrix has a logarithm if and only if if it is invertible.

\[ R = \log P \]

A. Matrix Logarithm of P

  • The \(\log\) can be found using spectral or eigenvalue decomposition.
  • If \(V\) is a matrix where each column is an eigenvector of \(P\), then

A. Matrix Logarithm of P

  • The \(\log\) can be found using spectral or eigenvalue decomposition.
  • If \(V\) is a matrix where each column is an eigenvector of \(P\), then

\[A' = V^{-1} A V\]

A. Matrix Logarithm of P

  • The \(\log\) can be found using spectral or eigenvalue decomposition.
  • If \(V\) is a matrix where each column is an eigenvector of \(P\), then

\[A' = V^{-1} A V\]

\[\log P = V (\log A') V^{-1}\]

A. Matrix Logarithm of P

V  <- eigen(mP)$vectors
iV <- solve(V)
Ap <- iV %*% mP %*% V
lAp <- diag(log(diag(Ap)), nrow(Ap), ncol(Ap))
R  <- V %*% lAp %*% iV
R[abs(R) < 1e-6 ] <- 0
dimnames(R) = dimnames(mP)
round(R,3)

A. Matrix Logarithm of P

V  <- eigen(mP)$vectors
iV <- solve(V)
Ap <- iV %*% mP %*% V
lAp <- diag(log(diag(Ap)), nrow(Ap), ncol(Ap))
R  <- V %*% lAp %*% iV
R[abs(R) < 1e-6 ] <- 0
dimnames(R) = dimnames(mP)
round(R,3)
        Healthy LowCD4   AIDS  Death
Healthy  -0.327  0.311  0.002  0.013
LowCD4    0.000 -0.543  0.615 -0.072
AIDS      0.000  0.000 -0.288  0.288
Death     0.000  0.000  0.000  0.000

A. Matrix Logarithm of P

Healthy LowCD4 AIDS Death
Healthy -0.327 0.311 0.002 0.013
LowCD4 0 -0.543 0.615 -0.072
AIDS 0 0 -0.288 0.288
Death 0 0 0 0
  • Note there is a negative rate from LowCD4 Death!
  • This is a transition from Death LowCD4 and should be moved to the other side of the matrix.

A. Matrix Logarithm of P

Healthy LowCD4 AIDS Death
Healthy -0.327 0.311 0.002 0.013
LowCD4 0 -0.543 0.615 -0.072
AIDS 0 0 -0.288 0.288
Death 0 0 0 0
  • The negative rate implies an identifiability issue.
  • Note rates from HealthyAIDS, HealthyDeath are relatively small, implying these are from statistical noise in observation.

A. Matrix Logarithm of P

  • expm::logm Higham (2008) method returns same as eigenvalue method.
  • Tweaking rates to get a proper model is ad-hoc and difficult.

A. Matrix Logarithm of P

  • Higham (2008) method returns same as eigenvalue method. - expm::logm()
  • Tweaking rates to get a proper model is ad-hoc and difficult.
R = expm::logm(mP)
        Healthy LowCD4   AIDS  Death
Healthy  -0.327  0.311  0.002  0.013
LowCD4    0.000 -0.543  0.615 -0.072
AIDS      0.000  0.000 -0.288  0.288
Death     0.000  0.000  0.000  0.000

B. Multi-State Model Based Approach

  • Imposition of structural assumptions and fitting to pseudo-data derived from original Markov model can result in reasonable rate estimates.
  • We could assume that a patient with HIV at this point in time only gets sicker and external causes of death are negligible. Healthy LowCD4 AIDS Death constrains model.
  • We use the reported data from the original model at year 1.

B. Multi-State Model Based Approach

Steps:

  1. Run the original model for 1+ cycles to obtain the Markov trace.
  2. Construct psuedo-data from the resulting trace based on a cohort with reasonable size (e.g., 1,000 patients)
  3. Estimate transition hazards in the pseudo-data based on a multistate model.1
  4. Use the estimated transition hazards to construct the rate matrix.

1. Run the original model

tr0 <- # Initial state occupancy.
  c("Healthy" = 1000,"LowCD4" = 0, "AIDS" = 0, "Death" = 0)
tr1 <- # State occupancy after one cycle.
  tr0 %*% mP 

tr <- # Bind together into a data frame. 
    rbind.data.frame(tr0, tr1) %>%
    mutate(cycle = c(0, 1)) %>% 
    select(cycle,everything())
tr
  cycle Healthy LowCD4 AIDS Death
1     0    1000      0    0     0
2     1     721    202   67    10

2. Construct psuedo-data

  • Idea: create data for 1,000 simulated “patients” followed for two cycles.
  • Counts in the Markov trace govern state occupancy in each cycle.
    • In first cycle, all 1,000 are in Healthy state.
    • In second cycle, 721 remain in Healthy while 202 are in LowCD4, etc.

2. Construct psuedo-data

tr <- 
    rbind.data.frame(tr0, tr1) %>%
    mutate(cycle = c(0, 1)) %>% 
    gather(state,count,-cycle) %>% 
    mutate(count = round(count,0)) %>% 
    arrange(cycle) %>% 
    lapply(.,rep,.$count) %>% 
    cbind.data.frame() %>% 
    as_tibble() %>% 
    mutate(state = factor(state,
      levels = c("Healthy","LowCD4","AIDS","Death"))) %>% 
    arrange(cycle,state) %>% 
    mutate(ptnum = rep(1:1000,2)) %>% 
    select(-count) %>% 
    arrange(ptnum,cycle) %>% 
    mutate(state = as.numeric(state)) %>% 
    select(ptnum,cycle,state)
# A tibble: 2,000 × 3
   ptnum cycle state
   <int> <dbl> <dbl>
 1     1     0     1
 2     1     1     1
 3     2     0     1
 4     2     1     1
 5     3     0     1
 6     3     1     1
 7     4     0     1
 8     4     1     1
 9     5     0     1
10     5     1     1
# ℹ 1,990 more rows

3. Fit Multistate Model

  • Next, fit a multistate model to the pseudo-data.
  • The multistate model will use likelihood-based methods to estimate transition intensities among health states in the data.
  • These transition intensities are the rates that you can use in the rate matrix!

3. Fit Multistate Model

  1. Define the state table for the multistate model
library(msm)
statetable.msm(state, ptnum, data=tr)
    to
from   1   2   3   4
   1 721 202  67  10
  1. Check that cell transition counts match the trace.
# Markov trace
rbind.data.frame(tr0, tr1) %>%
    mutate(cycle = c(0, 1)) %>% 
    select(cycle,everything())
  cycle Healthy LowCD4 AIDS Death
1     0    1000      0    0     0
2     1     721    202   67    10

3. Fit Multistate Model

  • Initial rate guesses are based on the eigenvalue method, i.e., result of expm(mP).
Q.init <- rbind(rbind(c(0, 0.3, 0,   .0001),    
                      c(0, 0,   0.6, 0.01),    
                      c(0, 0,   0,   0.3),  
                      c(0, 0,   0,   0)))
dimnames(Q.init) = dimnames(mP)
  • Fit the model
hiv.msm <- 
  msm(state ~ cycle, subject = ptnum, data = tr, qmatrix = Q.init)

3. Fit Multistate Model

hiv.msm

Call:
msm(formula = state ~ cycle, subject = ptnum, data = tr, qmatrix = Q.init)

Maximum likelihood estimates

Transition intensities
                  Baseline                          
Healthy - Healthy -0.32712 ( -3.680e-01, -2.907e-01)
Healthy - LowCD4   0.32702 (  1.354e-01,  7.898e-01)
Healthy - Death    0.00010 (  0.000e+00,        Inf)
LowCD4 - LowCD4   -0.64477 ( -1.139e+01, -3.650e-02)
LowCD4 - AIDS      0.63465 (  9.115e-06,  4.419e+04)
LowCD4 - Death     0.01012 ( 8.238e-299, 1.242e+294)
AIDS - AIDS       -0.34837 ( -3.805e+40, -3.189e-42)
AIDS - Death       0.34837 (  3.189e-42,  3.805e+40)

-2 * log-likelihood:  1572.2 

4. Construct the Rate Matrix

Rmsm <- msm::qmatrix.msm(hiv.msm, ci = "none")
round(Rmsm,3)
        Healthy LowCD4   AIDS Death
Healthy  -0.327  0.327  0.000 0.000
LowCD4    0.000 -0.645  0.635 0.010
AIDS      0.000  0.000 -0.348 0.348
Death     0.000  0.000  0.000 0.000

5. Embed the Transition Probability Matrix

Multistate-Model:

round(expm(Rmsm),3)
        Healthy LowCD4  AIDS Death
Healthy   0.721  0.202 0.067 0.010
LowCD4    0.000  0.525 0.388 0.088
AIDS      0.000  0.000 0.706 0.294
Death     0.000  0.000 0.000 1.000

Original Model Matrix:

round(mP,3)
        Healthy LowCD4  AIDS Death
Healthy   0.721  0.202 0.067 0.010
LowCD4    0.000  0.581 0.407 0.012
AIDS      0.000  0.000 0.750 0.250
Death     0.000  0.000 0.000 1.000

Markov Trace Comparison

# Original Model
c(1000, 0, 0, 0) %*% mP
     Healthy LowCD4 AIDS Death
[1,]     721    202   67    10
# MSM-Based Model
c(1000, 0, 0, 0) %*% pmatrix.msm(hiv.msm, t=1)
     Healthy LowCD4 AIDS Death
[1,]     721    202   67    10
  • Resulting expectation after 1 cycle is nearly identical.
  • The msm method has bits down at 4th decimal. Round off made them identical.
  • Only a tiny bit of error (721 vs. 721.0005) makes log numerically unstable converting from probability to rate.

Key Takeaways

  • By imposing some constraints on the underlying transitions, we were able to yield a generator matrix that makes sense!
  • Critically, our multistate model-based generator matrix closely approximates the original transition probability matrix, but does not imply negative rates that bring people back from death.

Key Takeaways

Multistate-Model:

round(logm(mP),3)
       [,1]   [,2]   [,3]   [,4]
[1,] -0.327  0.311  0.002  0.013
[2,]  0.000 -0.543  0.615 -0.072
[3,]  0.000  0.000 -0.288  0.288
[4,]  0.000  0.000  0.000  0.000

Original Model Matrix:

round(Rmsm,3)
        Healthy LowCD4   AIDS Death
Healthy  -0.327  0.327  0.000 0.000
LowCD4    0.000 -0.645  0.635 0.010
AIDS      0.000  0.000 -0.348 0.348
Death     0.000  0.000  0.000 0.000

Key Takeaways

  • With a reasonable generator matrix defined, we can now augment the model as we see fit:

    • Add additional health states.
    • Add new strategies with evidence on efficiacy drawn from the literature (e.g., hazard rates).
    • Add accumulators and transition states.
    • Change the cycle length.
  • Once done with the above, just embed the new matrix.

2. Solving for PSA Distributions

A Common Issue

  • We have defined our underlying model (either from bottom up or through backwards conversion) but need to define PSA distributions for model parameters.

  • Model draws on literature-based parameters that are reported as means, standard deviations, interquartile ranges, 95% confidence intervals, etc.

A Common Issue

  • Straightforward to obtain base case values from literature.

  • But how can we define PSA distributions based on limited information?

PSA Distributions

Parameter Type Distribution
Probability beta
Rate gamma
Utility weight beta
Right skew (e.g., cost) gamma, lognormal
Relative risks or hazard ratios lognormal
Odds Ratio logistic

Example

  • Cost parameter reported in literature has interquartile range of $300-$750.
  • What are the parameters of a PSA distribution such that the 25th percentile is $300 and the 75th percentile is $750?

Some Options

  • Analytic formulas for some distributions (normal, gamma, beta).
  • Formulas take as their inputs the two values (\(x_1\), \(x_2\)) and their associated quantiles (\(p_1\),\(p_2\)).
  • Formulas return the PSA distribution parameters that match up to these values.

Some Options

  • ParameterSolver implemented in Windows, Link

  • See Cook (2010) and Vasco Grilo’s blog post for more.

  • The next few slides provide you with nearly all the tools you’ll need, however.

Example: Cost PSA Distribution

  • Suppose the interquartile range for a key cost variable is [300,750]
  • We may have obtained this from the literature, from tabulating cost data, or from expert opinions.

Example: Cost PSA Distribution

  • Suppose the interquartile range for a key cost variable is [300,750]
  • We may have obtained this from the literature, from tabulating cost data, or from expert opinions.
x1 = 300
p1 = 0.25

x2 = 750
p2 = 0.75

Analytic Solution: Uniform PSA

For a uniform distribution with minimum \(a\) and maximum \(b\)

\[ a = \frac{p_2x_1 - p_1x_2}{p_2-p_1} \] \[ b = \frac{(1-p_1)x_2-(1-p_2)x_1}{p_2-p_1} \]

Analytic Solution: Uniform PSA

a = ((p2*x1) - (p1 * x2)) / (p2 - p1)
a
[1] 75
b = ((1 - p1)*x2 - (1 - p2)*x1) / (p2 - p1)
b
[1] 975

Analytic Solution: Uniform PSA

qunif(0.25, min = 75, max = 975)
[1] 300
qunif(0.75, min = 75, max = 975)
[1] 750

Analytic Solution: Normal PSA

\[ \sigma = \frac{x_2 - x_1}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)} \]

\[ \mu = \frac{x_1\Phi^{-1}(p_2)-x_2\Phi^{-1}(p_1)}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)} \]

Analytic Solution: Normal PSA

mu = (qnorm(p2)*x1 - qnorm(p1)*x2) / (qnorm(p2)-qnorm(p1))
mu
[1] 525
sigma = (x2 - x1) / (qnorm(p2) - qnorm(p1))
sigma
[1] 333.59

Analytic Solution: Normal PSA

qnorm(0.25, mean = 525, sd = 333.59)
[1] 300
qnorm(0.75, mean = 525, sd = 333.59)
[1] 750

Analytic Solution: Lognormal PSA

  • Just take log of \(x_1\) and \(x_2\)

\[ \sigma = \frac{\ln(x_2) - \ln(x_1)}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)} \]

\[ \mu = \frac{\ln(x_1)\Phi^{-1}(p_2)-\ln(x_2)\Phi^{-1}(p_1)}{\Phi^{-1}(p_2)-\Phi^{-1}(p_1)} \]

Analytic Solution: Lognormal PSA

mu = (qnorm(p2)*log(x1) - qnorm(p1)*log(x2)) / (qnorm(p2)-qnorm(p1))
mu
[1] 6.1619
sigma = (log(x2) - log(x1)) / (qnorm(p2) - qnorm(p1))
sigma
[1] 0.67925

Analytic Solution: Lognormal PSA

qlnorm(0.25, mean = 6.1619, sd = 0.67925)
[1] 299.99
qlnorm(0.75, mean = 6.1619, sd = 0.67925)
[1] 749.98

Analytic Solution: Gamma PSA

  • A bit more involved as it involves finding the root of a function.

Analytic Solution: Gamma PSA

  • A bit more involved as it involves finding the root of a function.
gamma_fn <- function(alpha) {
    x1*qgamma(p2,shape = alpha, scale =1) - x2 * qgamma(p1, shape = alpha, scale = 1)
}
curve(gamma_fn, xlim = c(1,10), col = "blue", lwd = 1.5, lty=2)
abline(a=0,b=0)

Analytic Solution: Gamma PSA

  • Root (i.e., point where the function crosses the zero line) seems to be between 2 and 4.
gamma_fn <- function(alpha) {
    x1*qgamma(p2,shape = alpha, scale =1) - x2 * qgamma(p1, shape = alpha, scale = 1)
}
curve(gamma_fn, xlim = c(1,10), col = "blue", lwd = 1.5, lty=2)
abline(a=0,b=0)

Analytic Solution: Gamma PSA

  • We’ll search in this range for our \(\alpha\) value.
gamma_fn <- function(alpha) {
    x1*qgamma(p2,shape = alpha, scale =1) - x2 * qgamma(p1, shape = alpha, scale = 1)
}
curve(gamma_fn, xlim = c(1,10), col = "blue", lwd = 1.5, lty=2)
abline(a=0,b=0)

Analytic Solution: Gamma PSA

alpha_ <- uniroot(gamma_fn,c(2,4))$root
alpha_
[1] 2.4558
calc_beta <- function(x1,p1,alpha) {
    x1 / qgamma(p1,alpha,1)
}

beta_ <- calc_beta(x1 = x1,  p1 = p1, alpha = alpha_)
beta_
[1] 230.16

Analytic Solution: Gamma PSA

qgamma(0.25,shape = 2.4558, scale = 230.16)
[1] 300
qgamma(0.75,shape = 2.4558, scale = 230.16)
[1] 750

Alternative: Optimization

  • General solution that can be used to solve for parameters of many common PSA distributions (e.g., beta, gamma, etc.).

\[\min_{\vec{\theta} \in R}\,\, (F(x_1|\vec{\theta})-p_1)^2+(F(x_2|\vec{\theta})-p_2)^2\]

Alternative: Optimization

  • Requires cumulative distribution function (CDF), \(F\).
  • See Borchers’ great slides for tips on optimization in R.

Some Advice

  • We have found that while analytically correct, this optimization formula is not numerically stable.
  • Transfinite scaling stabilizes optimization.

\[\min_{\vec{\theta} \in R}\,\, (F(x_1|\vec{\theta})-p_1)^2+(F(x_2|\vec{\theta})-p_2)^2\]

Stable Optimization

  • For probabilities, this is the \(\text{logit}(p)=\log \frac{p}{1-p}=\log p - \log (1-p)\).
  • Tweaking parameters to converge can still happen.
  • Example of theory versus practice.

Stable Optimization

  • For probabilities, this is the \(\text{logit}(p)=\log \frac{p}{1-p}=\log p - \log (1-p)\).
  • Tweaking parameters to converge can still happen.
  • Example of theory versus practice.

\[\min_{\vec{\theta} \in R} \sum_{i \in 1,2}\,\, (\text{logit} (F(x_i|\vec{\theta}))-\text{logit} (p_i))^2\]

Transfinite Example

Tf <- function(x, shape, rate)
    pgamma(x,shape,rate,log=TRUE) - 
    pgamma(x,shape,rate,log=TRUE, lower.tail = FALSE)

norm <- function(x1, p1, x2, p2, shape,rate)
    (Tf(x1, shape, rate)-(log(p1)-log(1-p1)) )^2 +
    (Tf(x2, shape, rate)-(log(p2)-log(1-p2)) )^2

fn <- function(x) norm(x1, p1, x2, p2, x[1], x[2])

gamma_optim <- 
  optim(c(0.5, 0.1), # initial parameter guesses
    fn,
    gr = function(x) pracma::grad(fn, x),
    lower=c(-Inf, 1e-5),
    method = "L-BFGS-B",
    control=list(factr=1e-10, maxit=100))$par

Transfinite Example

  • Optimization returns \(\alpha\) and \(1/\beta\) for the gamma distribution example.
# Analytic formula
alpha_
[1] 2.4558
# Optimization
gamma_optim[[1]]
[1] 2.4559
# Analytic formula
beta_
[1] 230.16
# Optimization
1/gamma_optim[[2]]
[1] 230.16

Optimization for Other Distributions

  • Just swap in the distribution function for the distribution you want to match to (e.g., pbeta rather than pgamma).
  • Also make sure the supplied parameter names match the distribution you’re aiming for.
Tf <- function(x, shape, rate)
    pgamma(x,shape,rate,log=TRUE) - 
    pgamma(x,shape,rate,log=TRUE, lower.tail = FALSE)

Summary on Distribution Fitting

  • Use analytical formulas if they exist.
  • Optimize on \(\text{logit}\) scale.
  • Again, avoid \(\log\) on probabilities, use log=TRUE.
  • Plot results for visual check.

Copula PSA Sampling

Sensitivity Analysis for Decision Models

  • It is standard in CEA analyses to explore sensitivity of model outputs and outcomes to variation in model inputs.
  • Many decision problems are complicated by dependencies between model inputs (e.g., costs, utilities, etc.).
  • These correlations may have meaningful impacts on model results and sensitivity.

HIV Model

In the HIV model discussed earlier, patients accrue two types of costs:

  1. Direct medical costs associated with each health state (Healthy, LowCD4, AIDS)
  2. Community medical costs associated with each health state.

HIV Model

  • These costs are likely correlated with each other in meaningful ways.
  • Similarly, other model parameters (e.g., state utility payoffs, hazard rates for treatment strategies, etc.) may also be correlated.

HIV Model

  • How can we reflect this correlation in n-way sensitivity analyses, such as PSAs?

What You Were Taught

  • Nearly all PSA exercises draw from independent distributions.
  • To fully explore the space of correlated parametersl, you need a lot of PSA draws.

What You Were Taught

  • PSA draws of two model parameters
    • \(c_{direct} \sim \text{gamma}(2.5, 1/230)\)
    • \(c_{community} \sim \text{gamma}(1.5, 1/150)\)

What We’ll Teach You

  • \(\text{cor}(c_{direct},c_{community}) = 0.8\)

What We’ll Teach You

Copula-Based PSA Sampling

  • We can easily draw correlated PSA samples via copulas.

  • Copulas allow us to sample from the joint distribution of correlated parameters

\[ F(x_1,...,x_d) = C \big ( F_1(x_1), ..., F_d(x_d) \big ) \]

Copulas

  • Copula: a multivariate cumulative distribution function with uniform marginals.
  • Intuition: the binding “glue” between random variables.

References

Chhatwal, Jagpreet, Suren Jayasuriya, and Elamin H. Elbasha. 2016. “Changing Cycle Lengths in State-Transition Models: Challenges and Solutions.” Medical Decision Making 36 (8): 952–64. https://doi.org/10.1177/0272989X16656165.
Cook, John D. 2010. “Determining Distribution Parameters from Quantiles.” Working paper.
Higham, Nicholas J. 2008. Functions of Matrices: Theory and Computation. SIAM.